Implementing the Context Tree Weighting Method for Text Compression
Abstract
The context tree weighting (CTW) method is a universal compression algorithm for FSMX sources. Although it is expected to achieve good compression ratios in practice, it is difficult to implement, and many implementations serve only to estimate the compression ratio. Willems and Tjalkens presented a practical implementation that uses conditional probabilities instead of block probabilities, but it applies only to binary-alphabet sequences. We extend the method to multi-alphabet sequences and show a simple implementation based on PPM techniques. We also propose a method for optimizing a parameter of context tree weighting in the binary-alphabet case. Experimental results on texts and DNA sequences show that the performance of PPM can be improved by combining it with context tree weighting, and that DNA sequences can be compressed to less than 2.0 bits per character (bpc).
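As a rough illustration of the basic binary method the abstract builds on, the sketch below computes the CTW code length of a bit sequence. It is not the implementation described in this paper: it works with block probabilities in the log domain (not the conditional-probability form of Willems and Tjalkens), handles only a binary alphabet, and does no PPM-style processing. The names `Node`, `log_add`, and `ctw_code_length`, the Krichevsky-Trofimov estimator at each node, the fixed 1/2 weights, and the all-zero initial context are assumptions made for the example.

```python
import math
from collections import deque


class Node:
    """One context-tree node: symbol counts plus two log-probabilities."""
    __slots__ = ("counts", "log_pe", "log_pw")

    def __init__(self):
        self.counts = [0, 0]  # number of 0s and 1s seen in this context
        self.log_pe = 0.0     # log of the KT-estimated block probability
        self.log_pw = 0.0     # log of the weighted block probability


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without underflow."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def ctw_code_length(bits, depth):
    """Ideal code length in bits, -log2 of the root weighted probability.

    Assumption: the symbols preceding the sequence are all zeros.
    """
    nodes = {}
    # Most recent symbol first; the deque keeps only the last `depth` bits.
    history = deque([0] * depth, maxlen=depth)
    for bit in bits:
        context = tuple(history)
        # Update the path from the deepest context up to the root.
        for d in range(depth, -1, -1):
            key = context[:d]
            node = nodes.setdefault(key, Node())
            c = node.counts
            # KT conditional probability of `bit` given the current counts.
            node.log_pe += math.log((c[bit] + 0.5) / (c[0] + c[1] + 1.0))
            c[bit] += 1
            if d == depth:
                node.log_pw = node.log_pe  # leaves use the estimate directly
            else:
                # Children never visited contribute probability 1 (log 0).
                children = sum(
                    nodes[key + (s,)].log_pw
                    for s in (0, 1)
                    if key + (s,) in nodes
                )
                node.log_pw = log_add(
                    math.log(0.5) + node.log_pe,
                    math.log(0.5) + children,
                )
        history.appendleft(bit)
    root = nodes.get(())
    return -root.log_pw / math.log(2) if root is not None else 0.0
```

For example, `ctw_code_length([0, 1] * 50, depth=1)` should come out far below 100 bits, because a depth-1 context predicts the alternating sequence exactly, whereas a real coder would additionally need an arithmetic coder driven by the conditional probabilities.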
Similar resources
The Context-Tree Weighting Method: Extensions (IEEE Transactions on Information Theory)
First we modify the basic (binary) context-tree weighting method such that the past symbols $x_{1-D}, x_{2-D}, \ldots, x_0$ are not needed by the encoder and the decoder. Then we describe how to make the context-tree depth $D$ infinite, which results in optimal redundancy behavior for all tree sources, while the number of records in the context tree is not larger than $2T - 1$. Here $T$ is the length of the source se...
A Study of the Context Tree Maximizing Method
The context tree weighting method can be adapted so that it finds the minimum description length model (MDL model) that corresponds to the data. This paper studies the resulting algorithm, the context tree maximizing algorithm, together with a few modifications of it, and in particular its performance when applied to data compression.
Context-Tree Weighting and Maximizing: Processing Betas
The context-tree weighting method (Willems, Shtarkov, and Tjalkens [1995]) is a sequential universal source coding method that achieves the Rissanen lower bound [1984] for tree sources. The same authors also proposed context-tree maximizing, a two-pass version of the context-tree weighting method [1993]. Later Willems and Tjalkens [1998] described a method based on ratios (betas) of sequence pr...
Arithmetic Coding with Adaptive Context-Tree Weighting for the H.264 Video Coders
We propose applying an adaptive context-tree weighting (CTW) method in H.264 video coders. We first investigate two different ways of incorporating the CTW method into an H.264 coder and compare the coding effectiveness of the method with that of the context models specified in the H.264 standard. We then describe a novel approach for automatically adapting the CTW method based ...